Overview

Dataset statistics

Number of variables: 29
Number of observations: 91796
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.3 MiB
Average record size in memory: 232.0 B
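
The memory figures above are mutually consistent if every variable occupies 8 bytes per row. The sketch below redoes that arithmetic; the 8-byte width is an assumption (the report does not list column dtypes), and the constant names are illustrative only.

```python
# Sanity-check the memory figures in the dataset statistics.
# Assumption (not stated in the report): each column stores one 8-byte
# value per row, e.g. int64, float64, or an object pointer.
N_VARIABLES = 29
N_OBSERVATIONS = 91796
BYTES_PER_VALUE = 8  # assumed

record_size_b = N_VARIABLES * BYTES_PER_VALUE
column_size_kib = N_OBSERVATIONS * BYTES_PER_VALUE / 1024
total_size_mib = record_size_b * N_OBSERVATIONS / 1024**2

print(f"Average record size: {record_size_b:.1f} B")
print(f"Per-variable memory size: {column_size_kib:.1f} KiB")
print(f"Total size in memory: {total_size_mib:.1f} MiB")
```

Under that assumption the three values round to 232.0 B, 717.2 KiB and 20.3 MiB, matching the record size, the per-variable "Memory size" and the total reported here.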

Variable types

CAT: 14
NUM: 13
BOOL: 2

Warnings

crash_date has a high cardinality: 492 distinct values [High cardinality]
crash_time has a high cardinality: 1440 distinct values [High cardinality]
zip_code has a high cardinality: 204 distinct values [High cardinality]
location has a high cardinality: 44604 distinct values [High cardinality]
on_street_name has a high cardinality: 3692 distinct values [High cardinality]
contributing_factor_vehicle_1 has a high cardinality: 54 distinct values [High cardinality]
number_of_motorist_injured is highly correlated with number_of_persons_injured and 1 other field [High correlation]
number_of_persons_injured is highly correlated with number_of_motorist_injured and 1 other field [High correlation]
year is highly correlated with collision_id [High correlation]
collision_id is highly correlated with year [High correlation]
victim is highly correlated with number_of_persons_injured and 1 other field [High correlation]
number_of_motorist_killed is highly correlated with number_of_persons_killed [High correlation]
number_of_persons_killed is highly correlated with number_of_motorist_killed [High correlation]
longitude is highly skewed (γ1 = -149.3269464) [Skewed]
collision_id has unique values [Unique]
number_of_persons_injured has 66819 (72.8%) zeros [Zeros]
number_of_pedestrians_injured has 87467 (95.3%) zeros [Zeros]
number_of_motorist_injured has 75522 (82.3%) zeros [Zeros]
vehicle_type_code1_num has 42899 (46.7%) zeros [Zeros]
vehicle_type_code2_num has 32950 (35.9%) zeros [Zeros]
hour has 3788 (4.1%) zeros [Zeros]
victim has 66699 (72.7%) zeros [Zeros]
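
The zero percentages in the warnings above can be recomputed from the raw counts and the 91796 observations. This is a small sketch of that check; the dictionary and variable names are illustrative, not part of the report.

```python
# Recompute the "Zeros" percentages from the counts listed in the warnings.
N_OBSERVATIONS = 91796

zero_counts = {
    "number_of_persons_injured": 66819,
    "number_of_pedestrians_injured": 87467,
    "number_of_motorist_injured": 75522,
    "vehicle_type_code1_num": 42899,
    "vehicle_type_code2_num": 32950,
    "hour": 3788,
    "victim": 66699,
}

# Percentage of rows equal to zero, rounded to one decimal like the report.
zero_pct = {
    name: round(100 * count / N_OBSERVATIONS, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```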

Reproduction

Analysis started: 2020-12-11 09:16:19.909233
Analysis finished: 2020-12-11 09:16:50.063289
Duration: 30.15 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

collision_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 91796
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4225941.687
Minimum: 225766
Maximum: 4353706
Zeros: 0
Zeros (%): 0.0%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 225766
5-th percentile: 3665120.5
Q1: 4182167.75
Median: 4300433.5
Q3: 4328281.25
95-th percentile: 4348298.25
Maximum: 4353706
Range: 4127940
Interquartile range (IQR): 146113.5

Descriptive statistics

Standard deviation: 161127.838
Coefficient of variation (CV): 0.03812826819
Kurtosis: 10.94913254
Mean: 4225941.687
Median Absolute Deviation (MAD): 51708
Skewness: -2.728466752
Sum: 3.879245431e+11
Variance: 2.596218018e+10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value                 Count  Frequency (%)
4327423               1      < 0.1%
4322057               1      < 0.1%
4342301               1      < 0.1%
4348446               1      < 0.1%
4346399               1      < 0.1%
4303392               1      < 0.1%
4170273               1      < 0.1%
4307490               1      < 0.1%
4174371               1      < 0.1%
4162085               1      < 0.1%
Other values (91786)  91786  > 99.9%

Minimum 5 values
Value    Count  Frequency (%)
225766   1      < 0.1%
2809808  1      < 0.1%
2874775  1      < 0.1%
2879367  1      < 0.1%
2912981  1      < 0.1%

Maximum 5 values
Value    Count  Frequency (%)
4353706  1      < 0.1%
4353705  1      < 0.1%
4353672  1      < 0.1%
4353663  1      < 0.1%
4353660  1      < 0.1%
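
Several of the collision_id statistics are derived from the others: the range from minimum and maximum, the IQR from Q1 and Q3, the coefficient of variation from standard deviation and mean, and the variance from the standard deviation. A short sketch that recomputes them from the reported figures (variable names are illustrative):

```python
# Recompute the derived statistics for collision_id from the reported values.
minimum, maximum = 225766, 4353706
q1, q3 = 4182167.75, 4328281.25
mean, std = 4225941.687, 161127.838

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4e}")
```

The results agree with the reported Range (4127940), IQR (146113.5), CV (about 0.038128) and Variance (about 2.5962e+10), up to rounding of the inputs.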
 

crash_date
Categorical

HIGH CARDINALITY

Distinct: 492
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value               Count  Frequency (%)
2019-07-16          619    0.7%
2019-07-19          614    0.7%
2019-08-08          604    0.7%
2019-07-29          600    0.7%
2019-07-26          593    0.6%
2019-07-15          593    0.6%
2019-09-03          591    0.6%
2019-07-22          588    0.6%
2019-07-30          584    0.6%
2019-07-18          582    0.6%
Other values (482)  85828  93.5%

[Figure: Frequencies of value counts]

Unique

Unique: 102
Unique (%): 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

crash_time
Categorical

HIGH CARDINALITY

Distinct: 1440
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value                Count  Frequency (%)
0:00                 1494   1.6%
17:00                1253   1.4%
16:00                1251   1.4%
14:00                1212   1.3%
18:00                1145   1.2%
15:00                1132   1.2%
13:00                1086   1.2%
12:00                1039   1.1%
19:00                935    1.0%
9:00                 895    1.0%
Other values (1430)  80354  87.5%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 5
Median length: 5
Mean length: 4.745348381
Min length: 4
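
The crash_time values are strings such as "0:00" and "17:00" without a leading zero, which matches the length range of 4 to 5 characters. A numeric hour like the one profiled in the hour variable could be obtained from such strings as sketched below. The report does not say how hour was actually derived, so this is an assumption, and `hour_from_crash_time` is a hypothetical helper.

```python
def hour_from_crash_time(crash_time: str) -> int:
    """Return the hour (0-23) from an 'H:MM' or 'HH:MM' string.

    Hypothetical helper: the report does not document how the `hour`
    column was produced.
    """
    hours_text, minutes_text = crash_time.strip().split(":")
    hour, minute = int(hours_text), int(minutes_text)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"not a valid time of day: {crash_time!r}")
    return hour


# Values taken from the frequency table above.
for text in ("0:00", "9:00", "17:00"):
    print(text, "->", hour_from_crash_time(text))
```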

borough
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value          Count  Frequency (%)
BROOKLYN       27388  29.8%
QUEENS         22282  24.3%
BRONX          14475  15.8%
MANHATTAN      13866  15.1%
Unspecified    11354  12.4%
STATEN ISLAND  2431   2.6%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 13
Median length: 8
Mean length: 7.695999826
Min length: 5

zip_code
Categorical

HIGH CARDINALITY

Distinct: 204
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value               Count  Frequency (%)
Unspecified         11357  12.4%
11207.0             1939   2.1%
11236.0             1333   1.5%
11208.0             1300   1.4%
11212.0             1263   1.4%
11385.0             1207   1.3%
11203.0             1189   1.3%
11434.0             1132   1.2%
11226.0             1077   1.2%
11234.0             1069   1.2%
Other values (194)  68930  75.1%

[Figure: Frequencies of value counts]

Unique

Unique: 9
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 11
Median length: 7
Mean length: 7.494879951
Min length: 7
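
The zip_code values are stored as float-formatted strings ("11207.0") alongside the placeholder "Unspecified", which explains the minimum length of 7 and maximum of 11. One possible clean-up is sketched below; `normalize_zip` is a hypothetical helper and is not part of the profiled pipeline.

```python
from typing import Optional


def normalize_zip(raw: str) -> Optional[str]:
    """Turn '11207.0' into '11207'; map the 'Unspecified' placeholder to None.

    Hypothetical clean-up step, shown only to illustrate the value format.
    """
    if raw == "Unspecified":
        return None
    # Parse as a float first because the stored text carries a '.0' suffix.
    return str(int(float(raw))).zfill(5)


for raw in ("11207.0", "11385.0", "Unspecified"):
    print(repr(raw), "->", repr(normalize_zip(raw)))
```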

latitude
Real number (ℝ≥0)

Distinct: 33674
Distinct (%): 36.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72675874
Minimum: 40.501465
Maximum: 40.91217
Zeros: 0
Zeros (%): 0.0%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 40.501465
5-th percentile: 40.60102875
Q1: 40.668167
Median: 40.717926
Q3: 40.785854
95-th percentile: 40.86434525
Maximum: 40.91217
Range: 0.410705
Interquartile range (IQR): 0.117687

Descriptive statistics

Standard deviation: 0.08082751098
Coefficient of variation (CV): 0.001984629111
Kurtosis: -0.7352348647
Mean: 40.72675874
Median Absolute Deviation (MAD): 0.052966
Skewness: 0.1662156864
Sum: 3738553.545
Variance: 0.006533086532
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value                 Count  Frequency (%)
40.861862             79     0.1%
40.8047               56     0.1%
40.820305             52     0.1%
40.696033             48     0.1%
40.675735             48     0.1%
40.658577             47     0.1%
40.737785             45     < 0.1%
40.651863             43     < 0.1%
40.65965              42     < 0.1%
40.733536             41     < 0.1%
Other values (33664)  91295  99.5%

Minimum 5 values
Value      Count  Frequency (%)
40.501465  1      < 0.1%
40.50331   1      < 0.1%
40.503387  1      < 0.1%
40.503414  1      < 0.1%
40.50447   1      < 0.1%

Maximum 5 values
Value      Count  Frequency (%)
40.91217   1      < 0.1%
40.912117  1      < 0.1%
40.912018  1      < 0.1%
40.91038   1      < 0.1%
40.91032   2      < 0.1%

longitude
Real number (ℝ)

SKEWED

Distinct: 26493
Distinct (%): 28.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.91783051
Minimum: -201.23706
Maximum: -73.700584
Zeros: 0
Zeros (%): 0.0%
Memory size: 717.2 KiB

Quantile statistics

Minimum: -201.23706
5-th percentile: -74.0157625
Q1: -73.96098
Median: -73.918304
Q3: -73.86333875
95-th percentile: -73.76194875
Maximum: -73.700584
Range: 127.536476
Interquartile range (IQR): 0.09764125

Descriptive statistics

Standard deviation: 0.8444969603
Coefficient of variation (CV): -0.01142480717
Kurtosis: 22511.0176
Mean: -73.91783051
Median Absolute Deviation (MAD): 0.048179
Skewness: -149.3269464
Sum: -6785361.169
Variance: 0.7131751159
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value                 Count  Frequency (%)
-73.91282             83     0.1%
-73.89063             73     0.1%
-73.91243             61     0.1%
-73.89083             58     0.1%
-73.89686             53     0.1%
-73.98453             52     0.1%
-73.93755             47     0.1%
-73.86536             46     0.1%
-73.96191             46     0.1%
-73.882744            46     0.1%
Other values (26483)  91231  99.4%

Minimum 5 values
Value       Count  Frequency (%)
-201.23706  4      < 0.1%
-74.253006  1      < 0.1%
-74.250824  1      < 0.1%
-74.25076   1      < 0.1%
-74.25015   1      < 0.1%

Maximum 5 values
Value       Count  Frequency (%)
-73.700584  1      < 0.1%
-73.70073   1      < 0.1%
-73.70099   1      < 0.1%
-73.701004  1      < 0.1%
-73.70129   1      < 0.1%
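
The extreme skewness and kurtosis reported for longitude appear to come from the minimum, -201.23706 (4 rows), which lies outside the valid longitude range of [-180, 180]; every other listed value is near -74. A minimal sketch of a range check that would flag such rows follows. `is_valid_longitude` is a hypothetical helper, and the sample holds only values quoted in this section.

```python
def is_valid_longitude(lon: float) -> bool:
    """A longitude in decimal degrees must lie within [-180, 180]."""
    return -180.0 <= lon <= 180.0


# Minimum, second-smallest, median and maximum longitude from the report.
sample = [-201.23706, -74.253006, -73.918304, -73.700584]

valid = [lon for lon in sample if is_valid_longitude(lon)]
invalid = [lon for lon in sample if not is_valid_longitude(lon)]

print("valid:", valid)
print("invalid:", invalid)
```

Whether to drop or repair the flagged rows is a separate decision that the report does not address.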
 

location
Categorical

HIGH CARDINALITY

Distinct: 44604
Distinct (%): 48.6%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value                   Count  Frequency (%)
(40.861862, -73.91282)  79     0.1%
(40.8047, -73.91243)    55     0.1%
(40.820305, -73.89083)  52     0.1%
(40.696033, -73.98453)  48     0.1%
(40.675735, -73.89686)  48     0.1%
(40.658577, -73.89063)  47     0.1%
(40.737785, -73.93496)  43     < 0.1%
(40.733536, -73.87035)  41     < 0.1%
(40.66496, -73.82226)   40     < 0.1%
(40.60567, -74.030945)  39     < 0.1%
Other values (44594)    91304  99.5%

[Figure: Frequencies of value counts]

Unique

Unique: 29003
Unique (%): 31.6%

[Figure: Histogram of lengths of the category]

Length

Max length: 25
Median length: 22
Mean length: 21.73174212
Min length: 17
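
Each location value is a string of the form "(latitude, longitude)", so it duplicates the separate latitude and longitude variables. A sketch of splitting such a string back into two floats is shown below; `parse_location` is a hypothetical helper, not something the report describes.

```python
from typing import Tuple


def parse_location(text: str) -> Tuple[float, float]:
    """Parse '(lat, lon)' into a (latitude, longitude) pair of floats."""
    lat_text, lon_text = text.strip().strip("()").split(",")
    return float(lat_text), float(lon_text)


# Most frequent location value from the table above.
lat, lon = parse_location("(40.861862, -73.91282)")
print(lat, lon)
```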

on_street_name
Categorical

HIGH CARDINALITY

Distinct: 3692
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value                       Count  Frequency (%)
Unspecified                 25128  27.4%
BELT PARKWAY                1455   1.6%
LONG ISLAND EXPRESSWAY      1000   1.1%
BROOKLYN QUEENS EXPRESSWAY  889    1.0%
BROADWAY                    843    0.9%
GRAND CENTRAL PKWY          797    0.9%
FDR DRIVE                   797    0.9%
ATLANTIC AVENUE             690    0.8%
MAJOR DEEGAN EXPRESSWAY     635    0.7%
CROSS BRONX EXPY            624    0.7%
Other values (3682)         58938  64.2%

[Figure: Frequencies of value counts]

Unique

Unique: 907
Unique (%): 1.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 32
Median length: 12
Mean length: 13.3261689
Min length: 6

number_of_persons_injured
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3685672578
Minimum: 0
Maximum: 15
Zeros: 66819
Zeros (%): 72.8%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 2
Maximum: 15
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7379089761
Coefficient of variation (CV): 2.002101273
Kurtosis: 17.05818929
Mean: 0.3685672578
Median Absolute Deviation (MAD): 0
Skewness: 3.168145535
Sum: 33833
Variance: 0.544509657
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=13)]
Common values
Value             Count  Frequency (%)
0                 66819  72.8%
1                 19345  21.1%
2                 3689   4.0%
3                 1180   1.3%
4                 469    0.5%
5                 172    0.2%
6                 68     0.1%
7                 31     < 0.1%
8                 12     < 0.1%
9                 5      < 0.1%
Other values (3)  6      < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
0      66819  72.8%
1      19345  21.1%
2      3689   4.0%
3      1180   1.3%
4      469    0.5%

Maximum 5 values
Value  Count  Frequency (%)
15     1      < 0.1%
11     3      < 0.1%
10     2      < 0.1%
9      5      < 0.1%
8      12     < 0.1%

number_of_persons_killed
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value  Count  Frequency (%)
0      91630  99.8%
1      160    0.2%
2      5      < 0.1%
3      1      < 0.1%

[Figure: Frequencies of value counts]

Unique

Unique: 1
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

number_of_pedestrians_injured
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0491415748
Minimum: 0
Maximum: 6
Zeros: 87467
Zeros (%): 95.3%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2272210757
Coefficient of variation (CV): 4.623805335
Kurtosis: 36.85800898
Mean: 0.0491415748
Median Absolute Deviation (MAD): 0
Skewness: 5.157088489
Sum: 4511
Variance: 0.05162941725
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]
Common values
Value  Count  Frequency (%)
0      87467  95.3%
1      4175   4.5%
2      134    0.1%
3      17     < 0.1%
6      2      < 0.1%
5      1      < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
0      87467  95.3%
1      4175   4.5%
2      134    0.1%
3      17     < 0.1%
5      1      < 0.1%

Maximum 5 values
Value  Count  Frequency (%)
6      2      < 0.1%
5      1      < 0.1%
3      17     < 0.1%
2      134    0.1%
1      4175   4.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value  Count  Frequency (%)
0      91737  99.9%
1      59     0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value  Count  Frequency (%)
0      87231  95.0%
1      4470   4.9%
2      94     0.1%
3      1      < 0.1%

[Figure: Frequencies of value counts]

Unique

Unique: 1
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value  Count  Frequency (%)
0      91776  > 99.9%
1      20     < 0.1%

number_of_motorist_injured
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2686718376
Minimum: 0
Maximum: 15
Zeros: 75522
Zeros (%): 82.3%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7034711697
Coefficient of variation (CV): 2.61832865
Kurtosis: 22.65571798
Mean: 0.2686718376
Median Absolute Deviation (MAD): 0
Skewness: 3.879183065
Sum: 24663
Variance: 0.4948716866
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=13)]
Common values
Value             Count  Frequency (%)
0                 75522  82.3%
1                 11021  12.0%
2                 3367   3.7%
3                 1135   1.2%
4                 468    0.5%
5                 167    0.2%
6                 64     0.1%
7                 29     < 0.1%
8                 12     < 0.1%
9                 5      < 0.1%
Other values (3)  6      < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
0      75522  82.3%
1      11021  12.0%
2      3367   3.7%
3      1135   1.2%
4      468    0.5%

Maximum 5 values
Value  Count  Frequency (%)
15     1      < 0.1%
11     3      < 0.1%
10     2      < 0.1%
9      5      < 0.1%
8      12     < 0.1%

number_of_motorist_killed
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value  Count  Frequency (%)
0      91709  99.9%
1      81     0.1%
2      5      < 0.1%
3      1      < 0.1%

[Figure: Frequencies of value counts]

Unique

Unique: 1
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

contributing_factor_vehicle_1
Categorical

HIGH CARDINALITY

Distinct: 54
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value                           Count  Frequency (%)
Unspecified                     23818  25.9%
Driver Inattention              23816  25.9%
Following Too Closely           6350   6.9%
Failure to Yield Right-of-Way   5650   6.2%
Backing Unsafely                3878   4.2%
Passing or Lane Usage Improper  3607   3.9%
Passing Too Closely             3474   3.8%
Other Vehicular                 2788   3.0%
Unsafe Speed                    2186   2.4%
Unsafe Lane Changing            2173   2.4%
Other values (44)               14056  15.3%

[Figure: Frequencies of value counts]

Unique

Unique: 2
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 32
Median length: 18
Mean length: 17.51903133
Min length: 5

Distinct: 44
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value                           Count  Frequency (%)
Unspecified                     58526  63.8%
Distraction                     25055  27.3%
Other Vehicular                 958    1.0%
Following Too Closely           889    1.0%
Limited                         881    1.0%
Bicyclist                       802    0.9%
Road Rage                       649    0.7%
Passing or Lane Usage Improper  579    0.6%
Failure to Yield Right-of-Way   506    0.6%
Passing Too Closely             439    0.5%
Other values (34)               2512   2.7%

[Figure: Frequencies of value counts]

Unique

Unique: 4
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 32
Median length: 11
Mean length: 11.48012985
Min length: 5

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value              Count  Frequency (%)
Sedan              42899  46.7%
Station Wagon      32950  35.9%
Taxi               3236   3.5%
Pick-up Truck      2372   2.6%
Box Truck          1804   2.0%
Other              1335   1.5%
Bike               1260   1.4%
Bus                1050   1.1%
Motorcycle         838    0.9%
Unspecified        707    0.8%
Other values (13)  3345   3.6%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 22
Median length: 5
Mean length: 8.360647523
Min length: 2

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value                                Count  Frequency (%)
Sport Utility Vehicle                32950  35.9%
Sedan                                19009  20.7%
Unspecified                          16303  17.8%
Station Wagon/Sport Utility Vehicle  12927  14.1%
Bike                                 2113   2.3%
Taxi                                 1542   1.7%
Other                                1442   1.6%
Pick-up Truck                        1415   1.5%
Box Truck                            1335   1.5%
Bus                                  602    0.7%
Other values (9)                     2158   2.4%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 35
Median length: 20
Mean length: 16.27900998
Min length: 3

vehicle_type_code1_num
Real number (ℝ≥0)

ZEROS

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.468571615
Minimum: 0
Maximum: 22
Zeros: 42899
Zeros (%): 46.7%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 8
Maximum: 22
Range: 22
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.996837902
Coefficient of variation (CV): 2.040648117
Kurtosis: 14.50865548
Mean: 1.468571615
Median Absolute Deviation (MAD): 1
Skewness: 3.569528408
Sum: 134809
Variance: 8.981037411
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=23)]
Common values
Value              Count  Frequency (%)
0                  42899  46.7%
1                  32950  35.9%
2                  3236   3.5%
3                  2372   2.6%
4                  1804   2.0%
5                  1335   1.5%
6                  1260   1.4%
7                  1050   1.1%
8                  838    0.9%
9                  707    0.8%
Other values (13)  3345   3.6%

Minimum 5 values
Value  Count  Frequency (%)
0      42899  46.7%
1      32950  35.9%
2      3236   3.5%
3      2372   2.6%
4      1804   2.0%

Maximum 5 values
Value  Count  Frequency (%)
22     88     0.1%
21     114    0.1%
20     124    0.1%
19     154    0.2%
18     156    0.2%

vehicle_type_code2_num
Real number (ℝ≥0)

ZEROS

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.964333958
Minimum: 0
Maximum: 20
Zeros: 32950
Zeros (%): 35.9%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 3
95-th percentile: 8
Maximum: 20
Range: 20
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.812686761
Coefficient of variation (CV): 1.431878092
Kurtosis: 9.223237086
Mean: 1.964333958
Median Absolute Deviation (MAD): 1
Skewness: 2.707249657
Sum: 180318
Variance: 7.911206813
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=19)]
Common values
Value             Count  Frequency (%)
0                 32950  35.9%
1                 19009  20.7%
2                 16303  17.8%
3                 12927  14.1%
6                 2113   2.3%
5                 1542   1.7%
7                 1442   1.6%
8                 1415   1.5%
9                 1335   1.5%
10                602    0.7%
Other values (9)  2158   2.4%

Minimum 5 values
Value  Count  Frequency (%)
0      32950  35.9%
1      19009  20.7%
2      16303  17.8%
3      12927  14.1%
5      1542   1.7%

Maximum 5 values
Value  Count  Frequency (%)
20     97     0.1%
19     137    0.1%
18     146    0.2%
16     182    0.2%
15     182    0.2%

day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.32389211
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
Median: 17
Q3: 24
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 8.951941846
Coefficient of variation (CV): 0.5483950633
Kurtosis: -1.22013628
Mean: 16.32389211
Median Absolute Deviation (MAD): 8
Skewness: -0.1230531982
Sum: 1498468
Variance: 80.13726281
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]
Common values
Value              Count  Frequency (%)
19                 3531   3.8%
26                 3453   3.8%
22                 3367   3.7%
20                 3350   3.6%
27                 3346   3.6%
24                 3327   3.6%
21                 3288   3.6%
23                 3285   3.6%
29                 3268   3.6%
3                  3253   3.5%
Other values (21)  58328  63.5%

Minimum 5 values
Value  Count  Frequency (%)
1      3014   3.3%
2      2938   3.2%
3      3253   3.5%
4      3053   3.3%
5      3034   3.3%

Maximum 5 values
Value  Count  Frequency (%)
31     1940   2.1%
30     2997   3.3%
29     3268   3.6%
28     3204   3.5%
27     3346   3.6%

month
Real number (ℝ≥0)

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.012549566
Minimum: 2
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4
Q1: 6
Median: 7
Q3: 8
95-th percentile: 9
Maximum: 12
Range: 10
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.820777322
Coefficient of variation (CV): 0.2596455546
Kurtosis: 0.02298307966
Mean: 7.012549566
Median Absolute Deviation (MAD): 1
Skewness: -0.2483054528
Sum: 643724
Variance: 3.315230055
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]
Common values
Value  Count  Frequency (%)
8      24821  27.0%
7      21728  23.7%
9      12004  13.1%
6      10699  11.7%
5      7839   8.5%
4      7137   7.8%
3      3771   4.1%
11     3250   3.5%
12     453    0.5%
10     50     0.1%

Minimum 5 values
Value  Count  Frequency (%)
2      44     < 0.1%
3      3771   4.1%
4      7137   7.8%
5      7839   8.5%
6      10699  11.7%

Maximum 5 values
Value  Count  Frequency (%)
12     453    0.5%
11     3250   3.5%
10     50     0.1%
9      12004  13.1%
8      24821  27.0%

year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.377696
Minimum: 2013
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 2013
5-th percentile: 2017
Q1: 2019
Median: 2020
Q3: 2020
95-th percentile: 2020
Maximum: 2020
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7786008224
Coefficient of variation (CV): 0.0003855647331
Kurtosis: 3.055579037
Mean: 2019.377696
Median Absolute Deviation (MAD): 0
Skewness: -1.598626281
Sum: 185370795
Variance: 0.6062192406
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]
Common values
Value  Count  Frequency (%)
2020   45944  50.1%
2019   40209  43.8%
2017   5534   6.0%
2018   82     0.1%
2015   19     < 0.1%
2013   7      < 0.1%
2014   1      < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
2013   7      < 0.1%
2014   1      < 0.1%
2015   19     < 0.1%
2017   5534   6.0%
2018   82     0.1%

Maximum 5 values
Value  Count  Frequency (%)
2020   45944  50.1%
2019   40209  43.8%
2018   82     0.1%
2017   5534   6.0%
2015   19     < 0.1%

hour
Real number (ℝ≥0)

ZEROS

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.23760295
Minimum: 0
Maximum: 23
Zeros: 3788
Zeros (%): 4.1%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 9
Median: 14
Q3: 18
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.968824943
Coefficient of variation (CV): 0.4508992276
Kurtosis: -0.4269631996
Mean: 13.23760295
Median Absolute Deviation (MAD): 4
Skewness: -0.4995797965
Sum: 1215159
Variance: 35.6268712
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=24)]
Common values
Value              Count  Frequency (%)
17                 6539   7.1%
16                 6444   7.0%
14                 6196   6.7%
15                 5768   6.3%
18                 5675   6.2%
13                 5527   6.0%
12                 5068   5.5%
11                 4615   5.0%
19                 4526   4.9%
10                 4253   4.6%
Other values (14)  37185  40.5%

Minimum 5 values
Value  Count  Frequency (%)
0      3788   4.1%
1      1864   2.0%
2      1434   1.6%
3      1225   1.3%
4      1154   1.3%

Maximum 5 values
Value  Count  Frequency (%)
23     2889   3.1%
22     3355   3.7%
21     3630   4.0%
20     3935   4.3%
19     4526   4.9%

victim
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3704518715
Minimum: 0
Maximum: 15
Zeros: 66699
Zeros (%): 72.7%
Memory size: 717.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 2
Maximum: 15
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7400402741
Coefficient of variation (CV): 1.997669147
Kurtosis: 17.25114774
Mean: 0.3704518715
Median Absolute Deviation (MAD): 0
Skewness: 3.17639185
Sum: 34006
Variance: 0.5476596073
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=14)]
Common values
Value             Count  Frequency (%)
0                 66699  72.7%
1                 19437  21.2%
2                 3707   4.0%
3                 1188   1.3%
4                 465    0.5%
5                 175    0.2%
6                 68     0.1%
7                 34     < 0.1%
8                 12     < 0.1%
9                 5      < 0.1%
Other values (4)  6      < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
0      66699  72.7%
1      19437  21.2%
2      3707   4.0%
3      1188   1.3%
4      465    0.5%

Maximum 5 values
Value  Count  Frequency (%)
15     1      < 0.1%
12     1      < 0.1%
11     2      < 0.1%
10     2      < 0.1%
9      5      < 0.1%
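
The totals in this report are consistent with victim being the per-collision sum of number_of_persons_injured and number_of_persons_killed, which would also explain the high-correlation warning. The report does not state how victim was derived, so the relationship is an inference; the sketch below only checks that the reported sums agree.

```python
# Reported sum of number_of_persons_injured.
injured_sum = 33833

# Value -> count table reported for number_of_persons_killed.
killed_counts = {0: 91630, 1: 160, 2: 5, 3: 1}
killed_sum = sum(value * count for value, count in killed_counts.items())

# Reported sum of victim.
victim_sum = 34006

print("killed total:", killed_sum)
print("injured + killed:", injured_sum + killed_sum)
print("matches victim sum:", injured_sum + killed_sum == victim_sum)
```

Matching totals do not prove a row-level identity, so a check against the underlying data would still be needed.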
 

season
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 717.2 KiB

Value   Count  Frequency (%)
Summer  61719  67.2%
Spring  21898  23.9%
Fall    6175   6.7%
Winter  2004   2.2%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 6
Median length: 6
Mean length: 5.865462547
Min length: 4

Interactions

[Figures: 65 pairwise interaction plots (Matplotlib v3.3.2 SVG images)]
2020-12-11T10:16:34.871611image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:34.992638image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.114666image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.242695image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.370723image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.497752image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.626781image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.747808image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:35.885840image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.149899image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.274927image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.405958image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.523984image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.654013image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.765039image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.877064image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:36.992090image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.105116image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.217141image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.334166image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.444191image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.567220image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.681246image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.790270image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:37.913297image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.018321image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.136347image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.260375image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.386404image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.515433image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.644462image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.772491image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:38.906521image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.031550image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.173581image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.304611image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.430640image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.566670image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.688698image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.821727image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:39.936754image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.051779image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.170806image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.289833image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.408860image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.532887image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.646913image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.777942image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:40.898970image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.013995image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.140024image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.251050image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.374077image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.484101image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.761163image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.877190image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:41.994216image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.109242image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.231270image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.346295image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.473324image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.591351image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.707377image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.830404image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:42.939429image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.062457image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.183484image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.306511image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.434541image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.562569image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.686597image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.818627image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:43.944656image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.082686image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.209715image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.332743image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.463772image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.581799image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.711828image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.817852image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:44.925876image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.036901image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.148927image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.259952image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.377979image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.486003image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.608030image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.721056image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.829080image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:45.948108image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.051132image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.168157image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.285183image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.404211image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.526238image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.652267image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.773294image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:46.902322image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:47.023350image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:47.158380image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:47.282408image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:47.404436image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:47.534464image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-11T10:16:47.650994image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
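
The calculation can be sketched in a few lines of plain Python. This is only an illustration of the formula described here, not the code the report itself runs; the function name `pearson_r` and the toy values are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's r: cov(X, Y) / (std(X) * std(Y))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)


# Hypothetical toy values, not taken from the profiled dataset.
persons_injured = [0, 1, 2, 0, 3, 1]
motorist_injured = [0, 1, 1, 0, 3, 0]
r = pearson_r(persons_injured, motorist_injured)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 5 for x in persons_injured], motorist_injured)
print(round(r, 4), round(r_rescaled, 4))
```

The second call illustrates the invariance under changes in location and scale: both printed values agree.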

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
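
A minimal sketch of this rank-then-correlate procedure, in plain Python. The helper names (`average_ranks`, `spearman_rho`) are hypothetical, tied values are given their average rank (a common convention, assumed here), and the data are toy values rather than columns from this dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Toy example: y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this toy input ρ reaches 1 while r stays below 1, which is the difference between monotonic and linear correlation described above.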

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
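
The pair-counting definition can be sketched directly. The function name `kendall_tau_a` is an assumption for this example; it implements the plain formula stated here (often called tau-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly default to a tie-adjusted variant (tau-b), so values may differ from the report's when ties are present.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # product == 0 means a tie in x or y: counted in neither group
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy example: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.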

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
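
As a rough illustration, the bias-corrected estimator can be written in plain Python from the chi-squared statistic of the contingency table. This sketch assumes the commonly cited form of Bergsma's correction and applies no continuity correction; the function name `cramers_v_corrected` and the toy labels are hypothetical, and the report's own implementation may differ in details.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (for example, a constant variable)
    return math.sqrt(phi2_corr / denom)


# Toy labels: the second variable is a relabelling of the first.
areas = ["A", "B", "C"] * 10
zones = [{"A": "z1", "B": "z2", "C": "z3"}[a] for a in areas]
v_perfect = cramers_v_corrected(areas, zones)

# Toy labels: every combination occurs equally often (independence).
v_independent = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
print(v_perfect, v_independent)
```

The two toy inputs sit at the ends of the range: a pure relabelling gives a value of 1 (up to rounding), and a perfectly balanced table gives 0.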

Missing values


Sample

First rows

| | collision_id | crash_date | crash_time | borough | zip_code | latitude | longitude | location | on_street_name | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_cyclist_injured | number_of_cyclist_killed | number_of_motorist_injured | number_of_motorist_killed | contributing_factor_vehicle_1 | contributing_factor_vehicle_2 | vehicle_type_code1 | vehicle_type_code2 | vehicle_type_code1_num | vehicle_type_code2_num | day | month | year | hour | victim | season |
| 0 | 4182249 | 2019-08-03 | 17:25 | STATEN ISLAND | 10307.0 | 40.501465 | -74.245230 | (40.501465, -74.24523) | SWINNERTON STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 3 | 8 | 2019 | 17 | 0 | Summer |
| 1 | 4201115 | 2019-09-07 | 0:36 | STATEN ISLAND | 10307.0 | 40.503310 | -74.237465 | (40.50331, -74.237465) | SPRAGUE AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unsafe Speed | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 7 | 9 | 2019 | 0 | 0 | Summer |
| 2 | 4198160 | 2019-08-17 | 15:00 | STATEN ISLAND | 10307.0 | 40.503387 | -74.248830 | (40.503387, -74.24883) | FINLAY STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Sedan | Unspecified | 0 | 2 | 17 | 8 | 2019 | 15 | 0 | Summer |
| 3 | 3664377 | 2017-05-06 | 2:55 | STATEN ISLAND | 10307.0 | 40.503414 | -74.244960 | (40.503414, -74.24496) | Unspecified | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Alcohol Involvement | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 6 | 5 | 2017 | 2 | 0 | Spring |
| 4 | 4327159 | 2020-07-08 | 20:20 | STATEN ISLAND | 10307.0 | 40.504470 | -74.243454 | (40.50447, -74.243454) | HYLAN BOULEVARD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Failure to Yield Right-of-Way | Unspecified | Sedan | Sedan | 0 | 1 | 8 | 7 | 2020 | 20 | 0 | Summer |
| 5 | 4326628 | 2020-07-06 | 17:00 | STATEN ISLAND | 10307.0 | 40.504482 | -74.247270 | (40.504482, -74.24727) | Unspecified | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 6 | 7 | 2020 | 17 | 1 | Summer |
| 6 | 4167167 | 2019-07-09 | 12:50 | STATEN ISLAND | 10307.0 | 40.505527 | -74.238190 | (40.505527, -74.23819) | HYLAN BOULEVARD | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | Following Too Closely | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 9 | 7 | 2019 | 12 | 2 | Summer |
| 7 | 4173987 | 2019-07-20 | 16:30 | STATEN ISLAND | 10307.0 | 40.506187 | -74.234900 | (40.506187, -74.2349) | HYLAN BOULEVARD | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Failure to Yield Right-of-Way | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 20 | 7 | 2019 | 16 | 1 | Summer |
| 8 | 4307254 | 2020-03-15 | 4:10 | STATEN ISLAND | 10307.0 | 40.506187 | -74.234900 | (40.506187, -74.2349) | JOLINE AVENUE | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | Driver Inattention | Distraction | Sedan | Unspecified | 0 | 2 | 15 | 3 | 2020 | 4 | 1 | Winter |
| 9 | 4331848 | 2020-07-18 | 7:26 | STATEN ISLAND | 10307.0 | 40.506187 | -74.234900 | (40.506187, -74.2349) | HYLAN BOULEVARD | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Driver Inattention | Distraction | Sedan | Unspecified | 0 | 2 | 18 | 7 | 2020 | 7 | 1 | Summer |

Last rows

| | collision_id | crash_date | crash_time | borough | zip_code | latitude | longitude | location | on_street_name | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_cyclist_injured | number_of_cyclist_killed | number_of_motorist_injured | number_of_motorist_killed | contributing_factor_vehicle_1 | contributing_factor_vehicle_2 | vehicle_type_code1 | vehicle_type_code2 | vehicle_type_code1_num | vehicle_type_code2_num | day | month | year | hour | victim | season |
| 91786 | 4338003 | 2020-08-13 | 7:55 | BRONX | 10471.0 | 40.910114 | -73.898544 | (40.910114, -73.898544) | Unspecified | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 13 | 8 | 2020 | 7 | 0 | Summer |
| 91787 | 4348893 | 2020-09-14 | 15:04 | BRONX | 10471.0 | 40.910130 | -73.903220 | (40.91013, -73.90322) | WEST 261 STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention | Distraction | Sedan | Unspecified | 0 | 2 | 14 | 9 | 2020 | 15 | 0 | Summer |
| 91788 | 4177187 | 2019-07-24 | 20:10 | BRONX | 10471.0 | 40.910130 | -73.903220 | (40.91013, -73.90322) | RIVERDALE AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 24 | 7 | 2019 | 20 | 0 | Summer |
| 91789 | 4344621 | 2020-09-04 | 1:51 | BRONX | 10471.0 | 40.910200 | -73.896614 | (40.9102, -73.896614) | WEST 262 STREET | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | Unspecified | Unspecified | Sedan | Other | 0 | 7 | 4 | 9 | 2020 | 1 | 2 | Summer |
| 91790 | 4350721 | 2020-09-23 | 6:30 | Unspecified | Unspecified | 40.910320 | -73.897640 | (40.91032, -73.89764) | WEST 262 STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Sedan | Unspecified | 0 | 2 | 23 | 9 | 2020 | 6 | 0 | Fall |
| 91791 | 4309078 | 2020-04-15 | 15:26 | BRONX | 10471.0 | 40.910320 | -73.897640 | (40.91032, -73.89764) | HUXLEY AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention | Distraction | Sedan | Unspecified | 0 | 2 | 15 | 4 | 2020 | 15 | 0 | Spring |
| 91792 | 4176935 | 2019-07-24 | 17:03 | BRONX | 10471.0 | 40.910380 | -73.896630 | (40.91038, -73.89663) | Unspecified | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 24 | 7 | 2019 | 17 | 0 | Summer |
| 91793 | 4346741 | 2020-09-11 | 3:30 | BRONX | 10471.0 | 40.912018 | -73.900000 | (40.912018, -73.9) | WEST 263 STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Sedan | Sedan | 0 | 1 | 11 | 9 | 2020 | 3 | 0 | Summer |
| 91794 | 4170658 | 2019-07-15 | 10:17 | BRONX | 10471.0 | 40.912117 | -73.902680 | (40.912117, -73.90268) | Unspecified | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | Taxi | Unspecified | 2 | 2 | 15 | 7 | 2019 | 10 | 0 | Summer |
| 91795 | 4315239 | 2020-05-22 | 23:55 | BRONX | 10471.0 | 40.912170 | -73.900770 | (40.91217, -73.90077) | Unspecified | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unsafe Speed | Unspecified | Station Wagon | Sport Utility Vehicle | 1 | 0 | 22 | 5 | 2020 | 23 | 0 | Spring |